corresponding sink downstream, so that the event is diverted accordingly. One caveat: it is not recommended to parse the event body in order to set headers. Flume is essentially a pipe; it is not meant to process the data while it is still in transit, and any processing should happen after the data leaves the pipe.
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = host
a1.sources.r1.interceptors.i1.hostHeader = hostname
As above, host is the interceptor type and hostHeader names the header that the interceptor writes.
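To make the diversion concrete, here is a minimal sketch of a multiplexing channel selector that routes events by the header the interceptor writes (the channel names c1/c2 and the header values web01/web02 are illustrative assumptions, not from the original article):

a1.sources.r1.channels = c1 c2
a1.sources.r1.selector.type = multiplexing
# route on the header written by the host interceptor above
a1.sources.r1.selector.header = hostname
# events whose hostname header is web01 go to channel c1
a1.sources.r1.selector.mapping.web01 = c1
# events whose hostname header is web02 go to channel c2
a1.sources.r1.selector.mapping.web02 = c2
# anything else falls back to c1
a1.sources.r1.selector.default = c1

Each channel then feeds its own sink, which is what "diverting the event to the corresponding sink downstream" amounts to.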
From the above information you can see the problem: the connection information on the server and the client does not match up, and the server holds many established connections that are in fact useless. At first I found this very strange and could not find the reason, so I could only go through the logs. The log information showed that an exception had occurred, but oddly, just before the exception there is a message of the form "Rpc sink {}: Closing Rpc client: {}". Here destroyConnection destroyed a connection, wh
of data senders in the logging system for data collection, while Flume also provides the ability to do simple processing of the data and write it to various data recipients (such as text files, HDFS, HBase, etc.). The Flume data stream is carried end to end by events. An event is Flume's basic unit of data; it carries log data (in byte
least 40,000+ events/sec, with an upper limit of around 70,000+ events/sec. Of course, the exact numbers depend on the machine's hardware configuration, but they let us evaluate whether Flume meets the performance requirements of the actual business. In addition, the results show that the maximum throughput of a single machine is related to the concurrency of the
polled, or its modification time has changed since the last poll. Renaming or moving a file does not change its modification time. When a Flume agent polls a non-existent file, one of the following two scenarios occurs: 1. When the configuration file is not found in the specified directory, the agent determines its behavior based on the flume.called.from.service property. If this property is set, polling is done
Flume and Kafka example (Kafka as the Flume sink, output to a Kafka topic). Preparation:
$ sudo mkdir -p /flume/web_spooldir
$ sudo chmod a+w -R /flume
Edit the Flume configuration file:
$ cat /home/tester/flafka/spooldir_kafka.conf
# Name the components in this agent
agent1.sources = weblogsrc
agent1.sinks = kafka-sink
agent1.channe
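The rest of the configuration file is cut off above. Purely as a sketch of how such a spooldir-to-Kafka agent is typically completed (the channel name, topic, and broker address below are illustrative assumptions, not from the original file):

agent1.channels = mem-channel

# spooling-directory source: reads files dropped into the directory
agent1.sources.weblogsrc.type = spooldir
agent1.sources.weblogsrc.spoolDir = /flume/web_spooldir
agent1.sources.weblogsrc.channels = mem-channel

# Kafka sink: writes each event to a Kafka topic
agent1.sinks.kafka-sink.type = org.apache.flume.sink.kafka.KafkaSink
agent1.sinks.kafka-sink.kafka.topic = weblogs
agent1.sinks.kafka-sink.kafka.bootstrap.servers = localhost:9092
agent1.sinks.kafka-sink.channel = mem-channel

# memory channel buffering events between the source and the sink
agent1.channels.mem-channel.type = memory
agent1.channels.mem-channel.capacity = 10000

Note that older Flume releases (1.6) use the Kafka sink properties topic and brokerList, while 1.7 and later use kafka.topic and kafka.bootstrap.servers.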
# Add environment configuration
export FLUME_HOME=/usr/local/flume
export PATH=.:$PATH:$FLUME_HOME/bin
$ source /etc/profile
$ flume-ng version   # verify installation
2. Select one or more nodes as the master node.
For master selection, you can define a single master for the cluster, or you can select multiple nodes as masters to improve availability.
Single-point master mode: easy to manage, but flawed in terms of system fault tolerance
In Flume 1.5.2, if you want to get Flume-related metrics through HTTP monitoring, add the following to the startup command:
-Dflume.monitoring.type=http -Dflume.monitoring.port=34545
A -D property can be read directly through System.getProperties(), so the two properties above are read by the method loadMonitoring(), which lives in Flume's entry-point class Application: private void loadMonitoring()
.processor.maxpenalty = 10000
# define the sink 1
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = 192.168.11.179
a1.sinks.k1.port = 9876
# define the sink 2
a1.sinks.k2.type = avro
a1.sinks.k2.hostname = 192.168.11.178
a1.sinks.k2.port = 9876
# use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel =
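The leading .processor.maxpenalty line belongs to a sink-group definition whose beginning is cut off. For context, here is a minimal sketch of the failover sink group that typically accompanies it (the group name g1 and the priority values are illustrative assumptions):

a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = failover
# the higher-priority sink is used first; k2 only takes over when k1 fails
a1.sinkgroups.g1.processor.priority.k1 = 10
a1.sinkgroups.g1.processor.priority.k2 = 5
# maximum back-off (in ms) applied to a failed sink before it is retried
a1.sinkgroups.g1.processor.maxpenalty = 10000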
import java.nio.charset.Charset;

public class MyApp {
  public static void main(String[] args) {
    MyRpcClientFacade client = new MyRpcClientFacade();
    // Initialize client with the remote Flume agent's host and port
    client.init("host.example.org", 41414);

    // Send the events to the remote Flume agent. That agent should be
    // configured to listen with an AvroSource.
= 1073741824
Common sinks:
Log to the console: Logger sink
a1.sinks.k1.type = logger
Store in local files: File Roll sink
# set up a rolling file sink
a1.sinks.k1.type = file_roll
# specify the directory; if it does not exist, an error is thrown
a1.sinks.k1.directory = /home/centos/log2
# set the rolling interval; 0 means never roll; default is 30s
a1.sinks.k1.sink.rollInterval = 0
Write to HDFS: HDFS sink (the default file type is SequenceFile; it can be changed via hdfs.fileType to SequenceFile, DataStream, or CompressedStream)
# specify the type
a1.sinks.k1.type = hdfs
# specify the path, without creating a
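Because the HDFS sink example above is cut off, here is a minimal sketch of how such a sink is commonly configured (the path, name node address, and file prefix are illustrative assumptions):

a1.sinks.k1.type = hdfs
# target directory in HDFS; the escape sequences are expanded per event
a1.sinks.k1.hdfs.path = hdfs://namenode:8020/flume/events/%Y-%m-%d
# write plain text instead of the default SequenceFile
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.writeFormat = Text
# prefix for the files created under the target directory
a1.sinks.k1.hdfs.filePrefix = events-
# lets the %Y-%m-%d escapes resolve even when events carry no timestamp header
a1.sinks.k1.hdfs.useLocalTimeStamp = true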
latest architectures.
2. Environmental requirements
Java Runtime Environment - Java version 1.7 or later
Memory - sources, channels, or sinks need to be configured with enough memory
Disk space - sufficient disk space for the configuration used by channels or sinks
Directory permissions - read/write permissions for the directories used by the agent
3. Data flow model
A Flume event is defined as a unit of data flow that contains a byte payload and an optional set of stri
operation. And that seems to be it ... it really is finished ... Readers, please do not scold me: Logstash is that simple, everything is integrated, and the programmer does not need to care about how it works internally. What is most noteworthy about Logstash is that the filter plugin section offers relatively complete functionality, for example Grok, which parses and structures arbitrary text with regular expressions; Grok is currently the best way to parse unstructured log data into something structured and queryable. In addition, Logstash can rename, dele
I haven't written a blog post in a long time. We have recently been studying Storm, Flume, and Kafka. Today I will write down the scenarios and conclusions from testing Flume failover and load balancing.
The test environment contains five configuration files, that is, five agents.
A main configuration file, that is, the configuration file (flume-sink.properties) for configur
# describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.channels = c1
a1.sources.r1.command = tail -f /home/sky/flume/log_exec_tail
# describe the sink
a1.sinks.k1.type = logger
# use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k
agent configuration
A Flume agent can be configured with multiple flows: it can have multiple sources, channels, and sinks, and combinations of these components form multiple data flows.
Usage scenario: multiple flows can be configured at the same time, and the flows do not interfere with each other. Take the following example. One flow is: avro-appsrv-source1 --> mem-channel-1 --> its sink; the other is: exec-tail-source2 --> file-channel-2 --> its sink.
#list
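The configuration listing introduced by the #list comment is cut off. Here is a minimal sketch of a two-flow agent following the naming in the text (the agent name agent_foo and the sink names are illustrative assumptions):

# list the sources, sinks and channels for the agent
agent_foo.sources = avro-appsrv-source1 exec-tail-source2
agent_foo.sinks = hdfs-sink1 avro-forward-sink2
agent_foo.channels = mem-channel-1 file-channel-2

# flow 1: avro source -> memory channel -> HDFS sink
agent_foo.sources.avro-appsrv-source1.channels = mem-channel-1
agent_foo.sinks.hdfs-sink1.channel = mem-channel-1

# flow 2: exec (tail) source -> file channel -> avro forwarding sink
agent_foo.sources.exec-tail-source2.channels = file-channel-2
agent_foo.sinks.avro-forward-sink2.channel = file-channel-2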
reaches that number, the temporary file is rolled into the target file; if it is set to 0, the file is not rolled based on the number of events
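Assuming this passage describes the HDFS sink's roll settings, here is a minimal sketch of how the three roll triggers are typically combined (the sink name k1 is illustrative); setting a value to 0 disables that particular trigger:

# roll the file every 10 minutes, regardless of size or event count
a1.sinks.k1.hdfs.rollInterval = 600
# 0 = never roll based on file size
a1.sinks.k1.hdfs.rollSize = 0
# 0 = never roll based on the number of events
a1.sinks.k1.hdfs.rollCount = 0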
Table 16. Run the Flume agent
Save the settings from the previous step, and then restart the Flume service, as shown in Figure 2.
Figure 2
After the restart, the status file records the latest ID value of 7, as sh
used for debugging a Flume deployment; it simply writes the events it receives to the log4j output. RollingFileSink: this sink mainly serializes the received events into files under a directory, so you need to configure the directory path, the file-rolling frequency, and so on. AvroSink: this is the most common sink in a layered (tiered) Flume architecture, and it is generally paired with an AvroSour
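To illustrate the tiered pattern described here, the following is a minimal sketch of a first-tier agent's Avro sink pointing at a second-tier agent's Avro source (the agent names, host name, and port are illustrative assumptions):

# tier 1: the collector agent forwards events over Avro RPC
agent1.sinks.avro-sink.type = avro
agent1.sinks.avro-sink.hostname = collector.example.com
agent1.sinks.avro-sink.port = 4545
agent1.sinks.avro-sink.channel = c1

# tier 2: the aggregator agent listens for the Avro RPC traffic
agent2.sources.avro-src.type = avro
agent2.sources.avro-src.bind = 0.0.0.0
agent2.sources.avro-src.port = 4545
agent2.sources.avro-src.channels = c1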
/FlumeDeveloperGuide.html
II. Characteristics of Flume
Flume is a distributed, reliable, and highly available system for collecting, aggregating, and transmitting large volumes of logs. It supports customizing all kinds of data senders in the log system to collect data, and it also provides the ability to do simple processing of the data and write it to various data recipients (such as text files, HDFS, HBase, etc.). Flume d
1. The source is in HTTP mode and the sink is in logger mode, so the data is printed to the console. The conf configuration file is as follows:
# Name the components in this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = http   # this setting means the source receives data sent over HTTP
a1.sources.r1.bind = hadoop-master   # the host name or IP address of the machine running Flume
a1.sources.r1.port = 9000   # port
#a1.sources.r1.fileheader = true
# Describe the sink
a1.sin
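The configuration is cut off after the sink section begins. As a sketch, such an http-to-logger agent typically finishes like this (the channel sizes are illustrative assumptions):

# Describe the sink: print events to the console via log4j
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

Once the agent is running, events can be sent by POSTing a JSON array of events to port 9000 (the format expected by the HTTP source's default JSONHandler), and the logger sink prints each event to the console.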